Abstract. Let $\psi : \mathbb{R}^n \to \mathbb{R}^k$ be a map defined by $k$ positive definite quadratic
forms on $\mathbb{R}^n$. We prove that the relative entropy (Kullback-Leibler) distance from
the convex hull of the image of $\psi$ to the image of $\psi$ is bounded above by an absolute
constant. More precisely, we prove that for every point $a = (a_1, \ldots, a_k)$ in the convex
hull of the image of $\psi$ such that $a_1 + \ldots + a_k = 1$ there is a point $b = (b_1, \ldots, b_k)$
in the image of $\psi$ such that $b_1 + \ldots + b_k = 1$ and such that $\sum_{i=1}^k a_i \ln (a_i/b_i) < 4.8$.
Similarly, we prove that for any positive integer $m$ one can choose a convex combination $b$ of
at most $m$ points from the image of $\psi$ such that $\sum_{i=1}^k a_i \ln (a_i/b_i) < 15/\sqrt{m}$.



1. Introduction 

Let $q_1, \ldots, q_k : \mathbb{R}^n \to \mathbb{R}$ be quadratic forms and let $\psi : \mathbb{R}^n \to \mathbb{R}^k$ be the
corresponding quadratic map,

$$\psi(x) = \left(q_1(x), \ldots, q_k(x)\right).$$

We are interested in the convex properties of the image $\psi(\mathbb{R}^n) \subset \mathbb{R}^k$. The image is
clearly convex when $k = 1$ and by the Dines Theorem it is convex when $k = 2$ (this
and related facts can be found, for example, in Sections II.12-14 of [Ba02] or in
[PT07]). The image is not necessarily convex for $k \geq 3$, though it remains convex
for $k = 3$ if some linear combination of the forms $q_1$, $q_2$ and $q_3$ is positive definite.

In this paper, we show that the image $\psi(\mathbb{R}^n)$ is close to its own convex hull
$\mathrm{conv}\left(\psi(\mathbb{R}^n)\right)$ in some information-theoretic sense.

Let

$$a = (a_1, \ldots, a_k) \quad \text{and} \quad b = (b_1, \ldots, b_k)$$

be two positive vectors such that

$$\sum_{i=1}^k a_i = \sum_{i=1}^k b_i = 1.$$

We interpret $a$ and $b$ as probability distributions and define the relative entropy of
$a$ with respect to $b$ as

$$D(a\|b) = \sum_{i=1}^k a_i \ln \left(\frac{a_i}{b_i}\right).$$

The quantity $D(a\|b)$ is also known as the Kullback-Leibler distance from $a$ to $b$
(although, generally speaking, $D(a\|b) \neq D(b\|a)$ and the triangle inequality does
not hold). Nevertheless, $D(a\|b) \geq 0$ with equality if and only if $a = b$, see, for
example, [CT06].
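
The definition above is easy to evaluate numerically. The following is a minimal sketch in Python (standard library only); the function name `relative_entropy` and the sample vectors are illustrative choices, not notation from the paper. It shows that $D(a\|a) = 0$, that $D(a\|b) > 0$ for $a \neq b$, and that the quantity is not symmetric.

```python
import math


def relative_entropy(a, b):
    """Return D(a||b) = sum_i a_i * ln(a_i / b_i) for positive probability vectors."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    if any(x <= 0 for x in a) or any(y <= 0 for y in b):
        raise ValueError("all entries must be positive")
    if abs(sum(a) - 1.0) > 1e-9 or abs(sum(b) - 1.0) > 1e-9:
        raise ValueError("entries of each vector must sum to 1")
    return sum(x * math.log(x / y) for x, y in zip(a, b))


# Hypothetical sample distributions, chosen only for illustration.
a = [0.5, 0.3, 0.2]
b = [0.8, 0.1, 0.1]

print("D(a||a) =", relative_entropy(a, a))
print("D(a||b) =", relative_entropy(a, b))
print("D(b||a) =", relative_entropy(b, a))
```

The input checks mirror the assumptions of the definition (positive entries, unit sum); they can be dropped when the inputs are known to be valid.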

We prove that with respect to the Kullback-Leibler distance, the image $\psi(\mathbb{R}^n)$
of a quadratic map is reasonably close to its own convex hull $\mathrm{conv}\left(\psi(\mathbb{R}^n)\right)$.

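(1.1) Theorem. Let $q_1, \ldots, q_k : \mathbb{R}^n \to \mathbb{R}$ be positive definite quadratic forms
and let $\psi : \mathbb{R}^n \to \mathbb{R}^k$ be the corresponding map,

$$\psi(x) = \left(q_1(x), \ldots, q_k(x)\right).$$

Let $a \in \mathrm{conv}\left(\psi(\mathbb{R}^n)\right)$ be a point, $a = (a_1, \ldots, a_k)$, such that $a_1 + \ldots + a_k = 1$.
Then there exists a point $b \in \psi(\mathbb{R}^n)$, $b = (b_1, \ldots, b_k)$, such that $b_1 + \ldots + b_k = 1$
and

$$\sum_{i=1}^k a_i \ln \left(\frac{a_i}{b_i}\right) \leq \beta$$

for some absolute constant $\beta > 0$. One can choose, for example, $\beta = 4.8$.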

We have undertaken some effort to optimize the constant $\beta$, but its optimal value
is not known at the moment and it would be interesting to find it.

Loosely speaking, Theorem 1.1 asserts that replacing the image of $\psi$ by its convex
hull leads to only a constant loss of information. The technique of semidefinite
programming is based on replacing computationally intractable systems of quadratic
equations and inequalities over the reals by computationally tractable systems of
linear equations and inequalities in positive semidefinite matrices. This procedure
is known as relaxation, see, for example, [Tu10]. The success of relaxation depends
on the convex properties of the underlying quadratic maps, see [PT07]. Speaking
even more loosely, one can speculate that the constant bound on the information
loss in Theorem 1.1 explains the success of semidefinite programming.

We also prove the following extension of Theorem 1.1. 
(1.2) Theorem. Let $q_1, \ldots, q_k : \mathbb{R}^n \to \mathbb{R}$ be positive definite quadratic forms
and let $\psi : \mathbb{R}^n \to \mathbb{R}^k$ be the corresponding map,

$$\psi(x) = \left(q_1(x), \ldots, q_k(x)\right).$$

Let $a \in \mathrm{conv}\left(\psi(\mathbb{R}^n)\right)$ be a point, $a = (a_1, \ldots, a_k)$, such that $a_1 + \ldots + a_k = 1$.
Then, for any positive integer $m$, there exists a point $b = (b_1, \ldots, b_k)$, such that
$b_1 + \ldots + b_k = 1$, the point $b$ is a convex combination of at most $m$ points of $\psi(\mathbb{R}^n)$
and

$$\sum_{i=1}^k a_i \ln \left(\frac{a_i}{b_i}\right) \leq \frac{15}{\sqrt{m}}.$$

We note a useful inequality

$$D(a\|b) = \sum_{i=1}^k a_i \ln \left(\frac{a_i}{b_i}\right) \geq \frac{1}{2} \left(\sum_{i=1}^k |a_i - b_i|\right)^2,$$

see, for example, Section 11.6 of [CT06]. The Approximate Caratheodory Theorem
of Maurey (see [Pi81] and Section 1.3 of [Ve+]) states that if $X$ is any set of points
in the standard simplex

$$\sum_{i=1}^k x_i = 1 \quad \text{and} \quad x_1, \ldots, x_k \geq 0$$

in $\mathbb{R}^k$ then any point $a \in \mathrm{conv}(X)$ can be approximated within error of $1/\sqrt{m}$
by a convex combination of $m$ points of $X$ in the $\ell^2$ (Euclidean) norm. Theorem
1.2 asserts that if $X$ is the image of a quadratic map then one can get a similar
approximation in the $\ell^1$ norm.
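
As a sanity check of the inequality relating relative entropy to the $\ell^1$ distance, the following Python sketch (standard library only) tests $D(a\|b) \geq \frac{1}{2}\left(\sum_i |a_i - b_i|\right)^2$ on randomly generated positive probability vectors. The helper names, the dimension, the number of trials and the seed are all illustrative assumptions; this is a numerical illustration, not a proof.

```python
import math
import random


def relative_entropy(a, b):
    """D(a||b) = sum_i a_i ln(a_i / b_i) for positive probability vectors."""
    return sum(x * math.log(x / y) for x, y in zip(a, b))


def l1_distance(a, b):
    """The l^1 distance sum_i |a_i - b_i|."""
    return sum(abs(x - y) for x, y in zip(a, b))


def random_distribution(k, rng):
    """A strictly positive probability vector of length k (illustrative sampler)."""
    weights = [rng.random() + 1e-3 for _ in range(k)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
worst_gap = math.inf
for _ in range(1000):
    a = random_distribution(5, rng)
    b = random_distribution(5, rng)
    gap = relative_entropy(a, b) - 0.5 * l1_distance(a, b) ** 2
    worst_gap = min(worst_gap, gap)

# The inequality predicts that the gap is never negative.
print("smallest observed gap:", worst_gap)
```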

The Johnson-Lindenstrauss Lemma implies that for any $\epsilon > 0$, if one chooses
$m = O\left(\epsilon^{-2} \ln k\right)$ in Theorem 1.2 then one can ensure that

$$\left|\frac{a_i}{b_i} - 1\right| < \epsilon \quad \text{for} \quad i = 1, \ldots, k,$$

see, for example, Sections V.5-6 of [Ba02] and [Ma08]. Theorem 1.2 asserts that if
we measure the Kullback-Leibler distance, then the dependence on the number $k$
of quadratic forms can be removed so that $m = O\left(\epsilon^{-2}\right)$ and

$$D(a\|b) = \sum_{i=1}^k a_i \ln \left(\frac{a_i}{b_i}\right) < \epsilon.$$



In the rest of the paper, we prove Theorems 1.1 and 1.2. In Section 2, we
establish some general results on the distribution of values of a positive semidefinite
quadratic form with respect to the Gaussian probability measure in $\mathbb{R}^n$. In Section
3, we consider the problem of maximizing a convex combination of logarithms
of positive semidefinite quadratic forms on the unit sphere. We prove that its
straightforward positive semidefinite relaxation produces a relative error bounded
by an absolute constant. In Section 4, we complete the proof of Theorem 1.1. The
proof of Theorem 1.2 given in Section 5 is a straightforward modification of our
proof of Theorem 1.1.

2. Quadratic forms and the Gaussian measure 

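In this section, we prove the following main result.

(2.1) Lemma. Let us fix in $\mathbb{R}^n$ the standard Gaussian probability measure $\mu_n$ with
density

$$\frac{1}{(2\pi)^{n/2}} e^{-\|x\|^2/2}.$$

Let $q : \mathbb{R}^n \to \mathbb{R}$ be a positive semidefinite quadratic form such that

$$\mathbf{E}\, q = 1.$$

Then

(1) We have

$$\mathbf{E}\, |\ln q| \leq 2.75;$$

(2) For $t \geq 1$ let us define

$$\phi(t) = \inf_{\alpha \geq 1} \frac{2^{\alpha}}{\sqrt{\pi}\, t^{\alpha}}\, \Gamma\left(\alpha + \frac{1}{2}\right).$$

Then

$$\mathbf{P}\left\{x : q(x) \geq t\right\} \leq \phi(t) \quad \text{for all} \quad t \geq 1.$$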

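Proof. Part (1) is essentially proved in [Ba99] but we present its proof here for
completeness. We have

$$\mathbf{E}\, |\ln q| \leq \left(\mathbf{E} \ln^2 q\right)^{1/2}.$$

We can write

$$q(x) = \sum_{i=1}^n \lambda_i x_i^2 \quad \text{for} \quad x = (x_1, \ldots, x_n) \tag{2.1.1}$$

in some orthonormal basis of $\mathbb{R}^n$. Since

$$\mathbf{E}\, q = 1 \quad \text{and} \quad \mathbf{E}\, x_i^2 = 1 \quad \text{for} \quad i = 1, \ldots, n,$$

we have

$$\sum_{i=1}^n \lambda_i = 1 \quad \text{and also} \quad \lambda_i \geq 0 \quad \text{for} \quad i = 1, \ldots, n. \tag{2.1.2}$$

Let

$$Y = \left\{x \in \mathbb{R}^n : q(x) \leq 1\right\}.$$

By the concavity of the logarithm,

$$\ln q(x) = \ln \left(\sum_{i=1}^n \lambda_i x_i^2\right) \geq \sum_{i=1}^n \lambda_i \ln x_i^2.$$

Since $\ln q(x) \leq 0$ for all $x \in Y$, using (2.1.2) and the convexity of the function
$t \mapsto t^2$, we conclude that

$$\int_Y \ln^2 q(x) \, d\mu_n(x) \leq \int_Y \left(\sum_{i=1}^n \lambda_i \ln x_i^2\right)^2 d\mu_n(x)
\leq \int_Y \sum_{i=1}^n \lambda_i \ln^2 x_i^2 \, d\mu_n(x) \leq \int_{\mathbb{R}^n} \ln^2 x_1^2 \, d\mu_n(x)$$

$$= \frac{8}{\sqrt{2\pi}} \int_0^{+\infty} \left(\ln^2 x\right) e^{-x^2/2} \, dx < 6.55.$$

On the other hand, since $\ln t \leq \sqrt{t}$ for $t \geq 1$, we conclude that

$$\int_{\mathbb{R}^n \setminus Y} \ln^2 q(x) \, d\mu_n(x) \leq \int_{\mathbb{R}^n \setminus Y} q(x) \, d\mu_n(x) \leq \int_{\mathbb{R}^n} q(x) \, d\mu_n(x) = 1.$$

Therefore,

$$\mathbf{E} \ln^2 q \leq 6.55 + 1 = 7.55 \quad \text{and} \quad \mathbf{E}\, |\ln q| \leq \sqrt{7.55} < 2.75,$$

which proves Part (1).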

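Let us choose any $\alpha \geq 1$. Applying the Markov inequality, we get

$$\mathbf{P}\left\{x : q(x) \geq t\right\} = \mathbf{P}\left\{x : q^{\alpha}(x) \geq t^{\alpha}\right\} \leq t^{-\alpha}\, \mathbf{E}\, q^{\alpha}.$$

Writing $q$ as in (2.1.1) and using (2.1.2) and the convexity of the function $t \mapsto t^{\alpha}$,
we obtain

$$\mathbf{E}\, q^{\alpha} = \mathbf{E} \left(\sum_{i=1}^n \lambda_i x_i^2\right)^{\alpha} \leq \mathbf{E} \sum_{i=1}^n \lambda_i \left(x_i^2\right)^{\alpha}
= \frac{2}{\sqrt{2\pi}} \int_0^{+\infty} x^{2\alpha} e^{-x^2/2} \, dx = \frac{2^{\alpha}}{\sqrt{\pi}}\, \Gamma\left(\alpha + \frac{1}{2}\right),$$

from which the proof of Part (2) follows. □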
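
The numerical constants in the proof can be cross-checked with a short Python sketch (standard library only). It relies on two standard facts about a standard Gaussian variable $x$ that are not stated in the text and are used here as assumptions of the sketch: $\mathbf{E} \ln x^2 = -\gamma - \ln 2$, where $\gamma$ is the Euler constant, and $\mathrm{Var} \ln x^2 = \pi^2/2$. The function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def second_moment_log_square():
    """E ln^2(x^2) for a standard Gaussian x, from the mean and variance of ln(x^2)."""
    mean = -EULER_GAMMA - math.log(2.0)  # E ln(x^2)
    variance = math.pi ** 2 / 2.0        # Var ln(x^2)
    return variance + mean ** 2


def moment_tail_bound(t, alpha):
    """The Markov-type bound 2^alpha * Gamma(alpha + 1/2) / (sqrt(pi) * t^alpha)."""
    return 2.0 ** alpha * math.gamma(alpha + 0.5) / (math.sqrt(math.pi) * t ** alpha)


second_moment = second_moment_log_square()
print("E ln^2(x^2)        =", second_moment)               # compare with 6.55
print("sqrt(6.55 + 1)     =", math.sqrt(7.55))             # compare with 2.75
print("bound, t=6, a=3    =", moment_tail_bound(6.0, 3.0))  # compare with 5/72
```

Evaluating the tail bound at other values of `alpha` gives a quick way to explore how close the choice $\alpha = 3$ is to the infimum for a given $t$.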

(2.2) Remark. The exact upper bound in Part (1) is not known to the author,
though it looks plausible that it is attained on forms of rank 1 and hence is equal
to

$$\frac{4}{\sqrt{2\pi}} \int_0^{+\infty} |\ln x|\, e^{-x^2/2} \, dx \approx 1.76.$$

3. An optimization problem on the sphere 



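(3.1) Notation. We consider the space $\mathrm{Sym}_n$ of $n \times n$ symmetric matrices endowed
with standard inner product

$$\langle A, B \rangle = \sum_{i,j=1}^n a_{ij} b_{ij} = \mathrm{trace}(AB),$$

where $A = (a_{ij})$ and $B = (b_{ij})$. For a vector $x \in \mathbb{R}^n$, $x = (x_1, \ldots, x_n)$, we define a
symmetric matrix $X = x \otimes x$, $X = (x_{ij})$, by $x_{ij} = x_i x_j$. Thus a quadratic form $q$
with matrix $Q$ can be written as

$$q(x) = \langle Q, x \otimes x \rangle \quad \text{for all} \quad x \in \mathbb{R}^n.$$

We write $X \succeq 0$ to say that $X$ is positive semidefinite and $X \succ 0$ to say that $X$ is
positive definite.

In $\mathbb{R}^n$, we consider the standard inner product

$$\langle x, y \rangle = \sum_{i=1}^n x_i y_i \quad \text{where} \quad x = (x_1, \ldots, x_n) \quad \text{and} \quad y = (y_1, \ldots, y_n),$$

the corresponding norm

$$\|x\| = \sqrt{\langle x, x \rangle},$$

and the unit sphere

$$S^{n-1} = \left\{x \in \mathbb{R}^n : \|x\| = 1\right\}.$$

In this section, we prove the following main result.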

(3.2) Theorem. Let $\alpha_1, \ldots, \alpha_k$ be non-negative reals such that $\alpha_1 + \ldots + \alpha_k = 1$,
let $Q_1, \ldots, Q_k$ be $n \times n$ positive definite matrices and let $q_1, \ldots, q_k : \mathbb{R}^n \to \mathbb{R}$ be
the corresponding quadratic forms,

$$q_i(x) = \langle Q_i, x \otimes x \rangle \quad \text{for} \quad i = 1, \ldots, k.$$

Then

$$\max_{x \in S^{n-1}} \sum_{i=1}^k \alpha_i \ln q_i(x) \leq \max_{\substack{X \succeq 0 \\ \mathrm{trace}(X) = 1}} \sum_{i=1}^k \alpha_i \ln \langle Q_i, X \rangle \leq \beta + \max_{x \in S^{n-1}} \sum_{i=1}^k \alpha_i \ln q_i(x),$$

where $\beta > 0$ is an absolute constant. One can choose $\beta = 4.8$.

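Proof. For $x \in S^{n-1}$ the matrix $X = x \otimes x$ satisfies the constraints $X \succeq 0$ and
$\mathrm{trace}(X) = 1$. Hence the first inequality holds.

Let $A$ be a matrix where the maximum value of the function

$$X \mapsto \sum_{i=1}^k \alpha_i \ln \langle Q_i, X \rangle$$

is attained on the set of positive semidefinite matrices of trace 1. Rescaling
$Q_i \mapsto \tau_i Q_i$ for some positive $\tau_1, \ldots, \tau_k$ if necessary, we may assume that $\langle Q_i, A \rangle = 1$
for $i = 1, \ldots, k$ and hence

$$\max_{\substack{X \succeq 0 \\ \mathrm{trace}(X) = 1}} \sum_{i=1}^k \alpha_i \ln \langle Q_i, X \rangle = \sum_{i=1}^k \alpha_i \ln \langle Q_i, A \rangle = 0. \tag{3.2.1}$$

Since $A$ is positive semidefinite, we can write $A = T^2$ for some symmetric $n \times n$
matrix $T$.

Let us fix the standard Gaussian probability measure $\mu_n$ in $\mathbb{R}^n$ with density

$$\frac{1}{(2\pi)^{n/2}} e^{-\|x\|^2/2}$$

and let $x \in \mathbb{R}^n$ be a random vector. Then

$$\mathbf{E}\, \|Tx\|^2 = \mathbf{E}\, \langle Tx, Tx \rangle = \mathbf{E}\, \langle T^2 x, x \rangle = \mathrm{trace}\left(T^2\right) = \mathrm{trace}(A) = 1.$$

Hence by Part (2) of Lemma 2.1,

$$\mathbf{P}\left\{x : \|Tx\|^2 \geq 6\right\} \leq \phi(6) < 0.07 \tag{3.2.2}$$

(choosing $\alpha = 3$ in the definition of $\phi(6)$, we obtain $\phi(6) \leq 5/72 < 0.07$).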
Furthermore, 

Eqi{Tx) —{QiTx, Tx) = {TQiTx, x) = trace (TQ^T) 

= tT&ce{QiT^) = {Qi,A) = 1 for i = l,...,k. 

Therefore, by Part (1) of Lemma 2.1, 

E |lngi(ra;)| < 2.75 for i = l,...,k 

and hence 

k 

E "^ailnqiiTx) < 2.75. 



Therefore, by the Markov inequahty, 



(3.2.3) P (a;: ^ lnQ,(Ta;) < -3 J < < 0.92. 



From (3.2.2)-(3.2.3) we conclude that there is an x G \ {0} such that 



k 

\\Txf < 6 and ^ailngi(Ta;) > -3. 

i=l 



Then for 

Tx 




we have 

k 

yeS""-^ and ^ailiiqi{y) > -3 - ln(6) > -4.8, 

and, in view of (3.2.1), the proof foUows. 
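
The rounding step of the proof (draw a Gaussian vector $x$, form $y = Tx/\|Tx\|$, evaluate the objective at $y$) can be sketched in a few lines of Python. To stay within the standard library, the sketch assumes that all matrices $Q_i$ and the matrix $X$ are diagonal, so that only the squared coordinates of $y$ matter; this restriction, the function name `round_to_sphere`, and the sample instance are illustrative assumptions and are not part of the paper. In the sample instance the uniform diagonal is a maximizer of the relaxation by symmetry and concavity.

```python
import math
import random


def round_to_sphere(weights, diag_forms, p, trials=2000, seed=0):
    """Gaussian rounding for diagonal forms q_i(x) = sum_j d_ij * x_j^2.

    weights    -- alpha_1..alpha_k, non-negative, summing to 1
    diag_forms -- rows (d_i1..d_in) of positive diagonal entries of Q_i
    p          -- diagonal of a PSD matrix X of trace 1 (p_j >= 0, sum p_j = 1)
    Returns (objective value of the relaxation at X, best value found on the sphere).
    """
    rng = random.Random(seed)
    relaxed = sum(w * math.log(sum(d * pj for d, pj in zip(row, p)))
                  for w, row in zip(weights, diag_forms))
    best = -math.inf
    for _ in range(trials):
        # With T = diag(sqrt(p_j)), the squared coordinates of Tx are p_j * x_j^2.
        squares = [pj * rng.gauss(0.0, 1.0) ** 2 for pj in p]
        norm_sq = sum(squares)
        if norm_sq == 0.0:
            continue  # Tx = 0 cannot be normalized; skip this draw
        y_sq = [s / norm_sq for s in squares]  # squared coordinates of y = Tx/||Tx||
        value = sum(w * math.log(sum(d * ys for d, ys in zip(row, y_sq)))
                    for w, row in zip(weights, diag_forms))
        best = max(best, value)
    return relaxed, best


# Hypothetical symmetric instance; the uniform p maximizes the relaxation here.
weights = [1.0 / 3.0] * 3
diag_forms = [[1.0, 0.01, 0.01], [0.01, 1.0, 0.01], [0.01, 0.01, 1.0]]
p = [1.0 / 3.0] * 3
relaxed, best = round_to_sphere(weights, diag_forms, p)
print("relaxation value:", relaxed, " best rounded value:", best)
```

Keeping the best of many draws is a practical convenience; the proof itself only needs one draw that avoids the two bad events (3.2.2) and (3.2.3).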

4. Proof of Theorem 1.1 

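Proof. Let us write

$$q_i(x) = \langle Q_i, x \otimes x \rangle \quad \text{for} \quad i = 1, \ldots, k,$$

where $Q_1, \ldots, Q_k$ are $n \times n$ positive definite matrices. Let

$$S = \sum_{i=1}^k Q_i.$$

Thus $S \succ 0$ and hence there exists an invertible symmetric $n \times n$ matrix $T$
such that $S = T^2$. Let us define new matrices

$$\hat{Q}_i = T^{-1} Q_i T^{-1} \quad \text{for} \quad i = 1, \ldots, k,$$

the corresponding quadratic forms

$$\hat{q}_i(x) = \langle \hat{Q}_i, x \otimes x \rangle = \langle Q_i, T^{-1}x \otimes T^{-1}x \rangle = q_i\left(T^{-1}x\right) \quad \text{for} \quad i = 1, \ldots, k$$

and the map $\hat{\psi} : \mathbb{R}^n \to \mathbb{R}^k$,

$$\hat{\psi}(x) = \left(\hat{q}_1(x), \ldots, \hat{q}_k(x)\right).$$

Clearly, $\psi(\mathbb{R}^n) = \hat{\psi}(\mathbb{R}^n)$ and

$$\sum_{i=1}^k \hat{Q}_i = I.$$

Hence, without loss of generality, we can assume that

$$\sum_{i=1}^k Q_i = I. \tag{4.1}$$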

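Since $a \in \mathrm{conv}\left(\psi(\mathbb{R}^n)\right)$, we can write

$$a_i = \langle Q_i, X \rangle \quad \text{for} \quad i = 1, \ldots, k$$

and some $X \succeq 0$. Moreover, in view of (4.1), we have

$$1 = \sum_{i=1}^k a_i = \left\langle \sum_{i=1}^k Q_i, X \right\rangle = \langle I, X \rangle = \mathrm{trace}(X).$$

We note that

$$\sum_{i=1}^k a_i \ln \langle Q_i, X \rangle = \sum_{i=1}^k a_i \ln a_i.$$

By Theorem 3.2, there is an $x \in S^{n-1}$ such that

$$\beta + \sum_{i=1}^k a_i \ln q_i(x) \geq \sum_{i=1}^k a_i \ln a_i.$$

Letting

$$b_i = q_i(x) \quad \text{for} \quad i = 1, \ldots, k,$$

we conclude that

$$\sum_{i=1}^k b_i = \sum_{i=1}^k \langle Q_i, x \otimes x \rangle = \langle I, x \otimes x \rangle = \mathrm{trace}(x \otimes x) = 1$$

and that

$$\sum_{i=1}^k a_i \ln \left(\frac{a_i}{b_i}\right) = \sum_{i=1}^k a_i \ln a_i - \sum_{i=1}^k a_i \ln b_i \leq \beta.$$

Moreover, for $b = (b_1, \ldots, b_k)$ we have $b = \psi(x)$, so $b \in \psi(\mathbb{R}^n)$. □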
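5. Proof of Theorem 1.2

(5.1) Lemma. For a positive integer $m$ let us consider $\mathbb{R}^{mn}$ as the direct sum

$$\mathbb{R}^{mn} = \underbrace{\mathbb{R}^n \oplus \ldots \oplus \mathbb{R}^n}_{m \text{ times}}.$$

Let us fix the standard Gaussian probability measure $\mu_n$ in $\mathbb{R}^n$ and consider the
standard Gaussian probability measure $\mu_{mn}$ in $\mathbb{R}^{mn}$ as the direct product

$$\mu_{mn} = \underbrace{\mu_n \times \ldots \times \mu_n}_{m \text{ times}}.$$

Let $q : \mathbb{R}^n \to \mathbb{R}$ be a positive semidefinite quadratic form and let us define a
quadratic form $q_m : \mathbb{R}^{mn} \to \mathbb{R}$ by

$$q_m(x_1, \ldots, x_m) = \frac{1}{m} \sum_{i=1}^m q(x_i) \quad \text{where} \quad x = (x_1, \ldots, x_m)$$

and $x_i \in \mathbb{R}^n$ for $i = 1, \ldots, m$. Suppose that

$$\mathbf{E}\, q = 1.$$

Then

(1) For all $t \geq 1$ we have

$$\mathbf{P}\left\{x \in \mathbb{R}^{mn} : q_m(x) \geq t\right\} \leq \exp\left\{\frac{m}{2}\left(1 - t + \ln t\right)\right\};$$

(2) For all $0 < t \leq 1$ we have

$$\mathbf{P}\left\{x \in \mathbb{R}^{mn} : q_m(x) \leq t\right\} \leq \exp\left\{\frac{m}{2}\left(1 - t + \ln t\right)\right\};$$

(3) We have

$$\mathbf{E}\, |\ln q_m| \leq \frac{6}{\sqrt{m}}.$$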

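Proof. We use the Laplace transform method, see also [HW71]. Since

$$\mathbf{E}\, q = 1,$$

in some orthonormal basis of $\mathbb{R}^n$ we can write

$$q(x) = \sum_{i=1}^n \lambda_i \xi_i^2 \quad \text{for} \quad x = (\xi_1, \ldots, \xi_n)$$

and

$$\sum_{i=1}^n \lambda_i = 1 \quad \text{and} \quad \lambda_i \geq 0 \quad \text{for} \quad i = 1, \ldots, n. \tag{5.1.1}$$

Writing vectors $x \in \mathbb{R}^{mn}$ as $x = (\xi_{11}, \ldots, \xi_{1n}, \xi_{21}, \ldots, \xi_{2n}, \ldots, \xi_{m1}, \ldots, \xi_{mn})$, we
write

$$q_m(x) = \frac{1}{m} \sum_{j=1}^m \sum_{i=1}^n \lambda_i \xi_{ji}^2.$$

For any $0 < \alpha < m/2$ we have

$$\mathbf{P}\left\{x \in \mathbb{R}^{mn} : q_m(x) \geq t\right\} = \mathbf{P}\left\{x \in \mathbb{R}^{mn} : e^{\alpha q_m(x)} \geq e^{\alpha t}\right\} \leq e^{-\alpha t}\, \mathbf{E}\, e^{\alpha q_m}
= e^{-\alpha t} \prod_{i=1}^n \left(1 - \frac{2 \alpha \lambda_i}{m}\right)^{-m/2}.$$

Since the function

$$(\lambda_1, \ldots, \lambda_n) \mapsto -\frac{m}{2} \sum_{i=1}^n \ln \left(1 - \frac{2 \alpha \lambda_i}{m}\right)$$

is convex, it attains its maximum on the simplex (5.1.1) at a vertex $\lambda_i = 1$, $\lambda_j = 0$
for $j \neq i$. Therefore,

$$\mathbf{P}\left\{x \in \mathbb{R}^{mn} : q_m(x) \geq t\right\} \leq e^{-\alpha t} \left(1 - \frac{2\alpha}{m}\right)^{-m/2}.$$

Optimizing on $\alpha$, we choose

$$\alpha = \frac{m}{2} \left(\frac{t - 1}{t}\right)$$

and the proof of Part (1) follows.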
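For any $\alpha > 0$ we have

$$\mathbf{P}\left\{x \in \mathbb{R}^{mn} : q_m(x) \leq t\right\} = \mathbf{P}\left\{x \in \mathbb{R}^{mn} : e^{-\alpha q_m(x)} \geq e^{-\alpha t}\right\} \leq e^{\alpha t}\, \mathbf{E}\, e^{-\alpha q_m}
= e^{\alpha t} \prod_{i=1}^n \left(1 + \frac{2 \alpha \lambda_i}{m}\right)^{-m/2}.$$

Since the function

$$(\lambda_1, \ldots, \lambda_n) \mapsto -\frac{m}{2} \sum_{i=1}^n \ln \left(1 + \frac{2 \alpha \lambda_i}{m}\right)$$

is convex, it attains its maximum on the simplex (5.1.1) at a vertex $\lambda_i = 1$, $\lambda_j = 0$
for $j \neq i$. Therefore,

$$\mathbf{P}\left\{x \in \mathbb{R}^{mn} : q_m(x) \leq t\right\} \leq e^{\alpha t} \left(1 + \frac{2\alpha}{m}\right)^{-m/2}.$$

Optimizing on $\alpha$, we choose

$$\alpha = \frac{m}{2} \left(\frac{1 - t}{t}\right)$$

and the proof of Part (2) follows.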
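
The optimization over $\alpha$ in Parts (1) and (2) can be verified numerically. The sketch below (Python, standard library only; the function names and the test values $m = 10$, $t = 2$, $t = 1/2$ are illustrative choices) evaluates the vertex bounds $e^{-\alpha t}(1 - 2\alpha/m)^{-m/2}$ and $e^{\alpha t}(1 + 2\alpha/m)^{-m/2}$ at the stated values of $\alpha$ and compares them with $\exp\{\frac{m}{2}(1 - t + \ln t)\}$.

```python
import math


def closed_form_bound(m, t):
    """exp{(m/2) * (1 - t + ln t)}, the bound in Parts (1) and (2)."""
    return math.exp(0.5 * m * (1.0 - t + math.log(t)))


def upper_tail_bound(m, t, alpha):
    """e^{-alpha t} * (1 - 2 alpha / m)^{-m/2}, defined for 0 < alpha < m/2."""
    return math.exp(-alpha * t) * (1.0 - 2.0 * alpha / m) ** (-0.5 * m)


def lower_tail_bound(m, t, alpha):
    """e^{alpha t} * (1 + 2 alpha / m)^{-m/2}, defined for alpha > 0."""
    return math.exp(alpha * t) * (1.0 + 2.0 * alpha / m) ** (-0.5 * m)


m = 10
alpha_upper = 0.5 * m * (2.0 - 1.0) / 2.0   # alpha = (m/2)(t - 1)/t with t = 2
alpha_lower = 0.5 * m * (1.0 - 0.5) / 0.5   # alpha = (m/2)(1 - t)/t with t = 1/2

print(upper_tail_bound(m, 2.0, alpha_upper), closed_form_bound(m, 2.0))
print(lower_tail_bound(m, 0.5, alpha_lower), closed_form_bound(m, 0.5))
```

Evaluating either bound at a nearby value of `alpha` gives a larger number, which is consistent with the stated choice being the minimizer.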
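Let us define

$$X_+ = \left\{x \in \mathbb{R}^{mn} : q_m(x) \geq 1\right\} \quad \text{and} \quad X_- = \left\{x \in \mathbb{R}^{mn} : q_m(x) < 1\right\}.$$

Then

$$\mathbf{E}\, |\ln q_m| = \int_{X_+} \ln q_m(x) \, d\mu_{mn}(x) - \int_{X_-} \ln q_m(x) \, d\mu_{mn}(x).$$

By Part (1),

$$\int_{X_+} \ln q_m(x) \, d\mu_{mn}(x) = \int_0^{+\infty} \mathbf{P}\left\{x : \ln q_m(x) \geq t\right\} dt = \int_0^{+\infty} \mathbf{P}\left\{x : q_m(x) \geq e^t\right\} dt$$

$$\leq \int_0^{+\infty} \exp\left\{\frac{m}{2}\left(1 - e^t + t\right)\right\} dt \leq \int_0^{+\infty} \exp\left\{-\frac{m t^2}{4}\right\} dt = \sqrt{\frac{\pi}{m}}.$$

By Part (2),

$$\int_{X_-} -\ln q_m(x) \, d\mu_{mn}(x) = \int_0^{+\infty} \mathbf{P}\left\{x : -\ln q_m(x) \geq t\right\} dt \leq \int_0^{+\infty} \exp\left\{\frac{m}{2}\left(1 - e^{-t} - t\right)\right\} dt.$$

Now,

$$\int_0^{+\infty} \exp\left\{\frac{m}{2}\left(1 - e^{-t} - t\right)\right\} dt = \int_0^1 \exp\left\{\frac{m}{2}\left(1 - e^{-t} - t\right)\right\} dt
+ \int_1^{+\infty} \exp\left\{\frac{m}{2}\left(1 - e^{-t} - t\right)\right\} dt$$

$$\leq \sqrt{\frac{3\pi}{2m}} + \frac{2}{m}.$$

Summarizing,

$$\mathbf{E}\, |\ln q_m| \leq \sqrt{\frac{\pi}{m}} + \sqrt{\frac{3\pi}{2m}} + \frac{2}{m} \leq \frac{6}{\sqrt{m}}$$

and the proof of Part (3) follows. □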
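
The final arithmetic step of Part (3) can be checked directly. The Python sketch below (standard library only) assumes that the three terms of the summary read $\sqrt{\pi/m}$, $\sqrt{3\pi/(2m)}$ and $2/m$, and confirms for a range of $m$ that their sum does not exceed $6/\sqrt{m}$; the function names and the tested range are illustrative.

```python
import math


def part3_terms_sum(m):
    """sqrt(pi/m) + sqrt(3*pi/(2m)) + 2/m, the three terms collected in Part (3)."""
    return math.sqrt(math.pi / m) + math.sqrt(3.0 * math.pi / (2.0 * m)) + 2.0 / m


def part3_bound(m):
    """The claimed bound 6 / sqrt(m)."""
    return 6.0 / math.sqrt(m)


# The tightest case is m = 1, where the sum is about 5.94.
print(part3_terms_sum(1), part3_bound(1))
all_ok = all(part3_terms_sum(m) <= part3_bound(m) for m in range(1, 10001))
print("bound holds for m = 1..10000:", all_ok)
```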

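(5.2) Theorem. Let $\alpha_1, \ldots, \alpha_k$ be non-negative reals such that $\alpha_1 + \ldots + \alpha_k = 1$,
let $Q_1, \ldots, Q_k$ be $n \times n$ positive definite matrices and let $m$ be a positive integer.
Then

$$\max_{\substack{X \succeq 0 \\ \mathrm{trace}(X) = 1 \\ \mathrm{rank}(X) \leq m}} \sum_{i=1}^k \alpha_i \ln \langle Q_i, X \rangle \leq \max_{\substack{X \succeq 0 \\ \mathrm{trace}(X) = 1}} \sum_{i=1}^k \alpha_i \ln \langle Q_i, X \rangle
\leq \frac{15}{\sqrt{m}} + \max_{\substack{X \succeq 0 \\ \mathrm{trace}(X) = 1 \\ \mathrm{rank}(X) \leq m}} \sum_{i=1}^k \alpha_i \ln \langle Q_i, X \rangle.$$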



Proof. The first inequality obviously holds.

Let $A$ be a matrix where the maximum value of the function

$$X \mapsto \sum_{i=1}^k \alpha_i \ln \langle Q_i, X \rangle$$

is attained on the set of positive semidefinite matrices of trace 1. Rescaling
$Q_i \mapsto \tau_i Q_i$ for some positive $\tau_1, \ldots, \tau_k$ if necessary, we may assume that $\langle Q_i, A \rangle = 1$
for $i = 1, \ldots, k$ and hence

$$\max_{\substack{X \succeq 0 \\ \mathrm{trace}(X) = 1}} \sum_{i=1}^k \alpha_i \ln \langle Q_i, X \rangle = \sum_{i=1}^k \alpha_i \ln \langle Q_i, A \rangle = 0. \tag{5.2.1}$$

Since $A$ is positive semidefinite, we can write $A = T^2$ for some symmetric $n \times n$
matrix $T$.

Let us fix the standard Gaussian probability measure $\mu_n$ in $\mathbb{R}^n$ with density

$$\frac{1}{(2\pi)^{n/2}} e^{-\|x\|^2/2}$$

and let $x_1, \ldots, x_m \in \mathbb{R}^n$ be $m$ independent random vectors. Then

$$\mathbf{E}\, \|Tx_j\|^2 = \mathbf{E}\, \langle Tx_j, Tx_j \rangle = \mathbf{E}\, \langle T^2 x_j, x_j \rangle = \mathrm{trace}\left(T^2\right) = \mathrm{trace}(A) = 1.$$

Applying Part (1) of Lemma 5.1, we conclude that

$$\mathbf{P}\left\{x_1, \ldots, x_m : \frac{1}{m} \sum_{j=1}^m \|Tx_j\|^2 \geq 1 + \frac{3}{\sqrt{m}}\right\} \leq \exp\left\{-\frac{9}{8}\right\} < 0.33 \tag{5.2.2}$$

(we use that $\ln(1 + s) \leq s - s^2/4$ for $0 \leq s \leq 1$).
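Let us define quadratic forms

$$q_i(x) = \langle Q_i, x \otimes x \rangle \quad \text{for} \quad i = 1, \ldots, k.$$

Then

$$\mathbf{E}\, q_i(Tx_j) = \mathbf{E}\, \langle Q_i T x_j, Tx_j \rangle = \mathbf{E}\, \langle T Q_i T x_j, x_j \rangle = \mathrm{trace}\left(T Q_i T\right)
= \mathrm{trace}\left(Q_i T^2\right) = \langle Q_i, A \rangle = 1 \quad \text{for} \quad i = 1, \ldots, k.$$

Therefore, by Part (3) of Lemma 5.1,

$$\mathbf{E} \left|\ln \left(\frac{1}{m} \sum_{j=1}^m q_i(Tx_j)\right)\right| \leq \frac{6}{\sqrt{m}} \quad \text{for} \quad i = 1, \ldots, k$$

and hence

$$\mathbf{E} \left|\sum_{i=1}^k \alpha_i \ln \left(\frac{1}{m} \sum_{j=1}^m q_i(Tx_j)\right)\right| \leq \frac{6}{\sqrt{m}}.$$

Therefore, by the Markov inequality,

$$\mathbf{P}\left\{x_1, \ldots, x_m : \sum_{i=1}^k \alpha_i \ln \left(\frac{1}{m} \sum_{j=1}^m q_i(Tx_j)\right) \leq -\frac{12}{\sqrt{m}}\right\} \leq 0.5. \tag{5.2.3}$$

From (5.2.2)-(5.2.3) we conclude that there are points $x_1, \ldots, x_m \in \mathbb{R}^n \setminus \{0\}$ such
that

$$\frac{1}{m} \sum_{j=1}^m \|Tx_j\|^2 < 1 + \frac{3}{\sqrt{m}} \quad \text{and} \quad \sum_{i=1}^k \alpha_i \ln \left(\frac{1}{m} \sum_{j=1}^m q_i(Tx_j)\right) > -\frac{12}{\sqrt{m}}.$$

Let us define a matrix $Y$ by

$$Y = \left(\sum_{j=1}^m \|Tx_j\|^2\right)^{-1} \sum_{j=1}^m (Tx_j) \otimes (Tx_j).$$

Then

$$Y \succeq 0, \quad \mathrm{trace}(Y) = 1, \quad \mathrm{rank}\, Y \leq m$$

and

$$\sum_{i=1}^k \alpha_i \ln \langle Q_i, Y \rangle = \sum_{i=1}^k \alpha_i \ln \left(\frac{1}{m} \sum_{j=1}^m q_i(Tx_j)\right) - \ln \left(\frac{1}{m} \sum_{j=1}^m \|Tx_j\|^2\right)$$

$$> -\frac{12}{\sqrt{m}} - \ln \left(1 + \frac{3}{\sqrt{m}}\right) \geq -\frac{15}{\sqrt{m}}$$

and, in view of (5.2.1), the proof follows. □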

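(5.3) Proof of Theorem 1.2. As in the proof of Theorem 1.1 in Section 4,
without loss of generality we assume that

$$\sum_{i=1}^k Q_i = I. \tag{5.3.1}$$

Since $a \in \mathrm{conv}\left(\psi(\mathbb{R}^n)\right)$, we can write

$$a_i = \langle Q_i, X \rangle \quad \text{for} \quad i = 1, \ldots, k$$

and some $X \succeq 0$. Moreover, in view of (5.3.1), we have

$$1 = \sum_{i=1}^k a_i = \left\langle \sum_{i=1}^k Q_i, X \right\rangle = \langle I, X \rangle = \mathrm{trace}(X).$$

We note that

$$\sum_{i=1}^k a_i \ln \langle Q_i, X \rangle = \sum_{i=1}^k a_i \ln a_i.$$

By Theorem 5.2, there is an $n \times n$ symmetric matrix $Y$, such that $Y \succeq 0$, $\mathrm{trace}(Y) = 1$,
$\mathrm{rank}\, Y \leq m$ and

$$\frac{15}{\sqrt{m}} + \sum_{i=1}^k a_i \ln \langle Q_i, Y \rangle \geq \sum_{i=1}^k a_i \ln a_i.$$

Let

$$b_i = \langle Q_i, Y \rangle \quad \text{for} \quad i = 1, \ldots, k.$$

Then

$$\sum_{i=1}^k b_i = \left\langle \sum_{i=1}^k Q_i, Y \right\rangle = \mathrm{trace}(Y) = 1.$$

Since $\mathrm{rank}\, Y \leq m$, we can write

$$Y = \frac{1}{m} \sum_{j=1}^m y_j \otimes y_j$$

for some $y_1, \ldots, y_m \in \mathbb{R}^n$. Then

$$b_i = \frac{1}{m} \sum_{j=1}^m q_i(y_j) \quad \text{for} \quad i = 1, \ldots, k$$

and $b$ is a convex combination of at most $m$ points from $\psi(\mathbb{R}^n)$. □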
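
The construction of $b$ from $m$ Gaussian samples can be sketched as follows in Python (standard library only). As a simplifying assumption that is not made in the paper, the matrices $Q_i$ and $X$ are taken to be diagonal, with the diagonals of the $Q_i$ summing to the all-ones vector so that (5.3.1) holds; the function names, the sample instance, and the strategy of keeping the best of several draws are likewise illustrative.

```python
import math
import random


def relative_entropy(a, b):
    """D(a||b) = sum_i a_i ln(a_i / b_i) for positive probability vectors."""
    return sum(x * math.log(x / y) for x, y in zip(a, b))


def sample_b(diag_forms, p, m, rng):
    """One draw of b_i = <Q_i, Y> for diagonal Q_i = diag(d_i1..d_in), X = diag(p).

    Y is the normalized sum of (T x_r) (x) (T x_r) over m Gaussian samples x_r,
    with T = diag(sqrt(p_j)); only the diagonal of Y is needed here.
    """
    while True:
        s = [pj * sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m)) for pj in p]
        total = sum(s)
        if total > 0.0:
            break  # resample in the (probability zero) degenerate case
    y_diag = [sj / total for sj in s]  # diagonal of Y, so trace(Y) = 1
    return [sum(d * yj for d, yj in zip(row, y_diag)) for row in diag_forms]


def best_b(diag_forms, p, m, trials=200, seed=0):
    """Return a = (<Q_i, X>)_i and the sampled b with the smallest D(a||b)."""
    rng = random.Random(seed)
    a = [sum(d * pj for d, pj in zip(row, p)) for row in diag_forms]
    draws = [sample_b(diag_forms, p, m, rng) for _ in range(trials)]
    return a, min(draws, key=lambda b: relative_entropy(a, b))


# Hypothetical instance: each column of diag_forms sums to 1, so sum_i Q_i = I.
diag_forms = [[0.7, 0.2, 0.1], [0.2, 0.7, 0.2], [0.1, 0.1, 0.7]]
p = [0.5, 0.3, 0.2]
m = 4
a, b = best_b(diag_forms, p, m)
print("D(a||b) =", relative_entropy(a, b), " bound 15/sqrt(m) =", 15.0 / math.sqrt(m))
```

Increasing `m` in this sketch lets one observe how quickly the observed relative entropy shrinks compared with the guaranteed rate $15/\sqrt{m}$.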